Abstract: Motivated by the significant performance gains
which polar codes experience when they are decoded with
successive cancellation list decoders, we study how the scaling
exponent changes as a function of the list size L. In particular,
we fix the block error probability $P_e$ and we analyze the tradeoff
between the blocklength N and the back-off from capacity $C - R$
using scaling laws. By means of a Divide and Intersect procedure,
we provide a lower bound on the error probability under MAP 
decoding with list size L for any binary-input memoryless output- 
symmetric channel and for any class of linear codes such that 
their minimum distance is unbounded as the blocklength grows 
large. We show that, although list decoding can significantly 
improve the involved constants, the scaling exponent itself, i.e., 
the speed at which capacity is approached, stays unaffected. This 
result applies in particular to polar codes, since their minimum 
distance tends to infinity as N increases. We also discuss the
genie-aided successive cancellation decoder when transmission
takes place over the binary erasure channel.

I. Introduction 

Error Exponent and Scaling Exponent. While studying 
the error performance of a code family when transmission 
takes place over a binary-input memoryless output-symmetric 
channel (BMSC) W with Shannon capacity C, the parameters 
of interest are, in general, the rate R, the blocklength N, 
and the block error probability $P_e$. Ideally, we would like
to characterize $P_e(N, R, W)$ exactly as a function of its
parameters, in particular N and R, but this is hard to achieve.
A slightly easier task is to fix one of the quantities $(P_e, N, R)$
and then to explore the trade-off between the remaining two.

The oldest such approach is to compute the error exponent:
we fix the rate R and we are interested in the trade-off between
$P_e$ and N. In particular, we compute how $P_e$ behaves when
N tends to infinity. For various standard classical random
ensembles (e.g., the Shannon ensemble or the Fano ensemble,
see [1]), it is well known that $P_e$ tends to 0 exponentially fast
in the blocklength, i.e., $P_e = O(e^{-\alpha N})$, for any $0 < R < C$.
For a fairly recent survey on how to determine $\alpha$ for various
such ensembles, we refer to [2].

However, the error exponent gives only limited guidance 
for the design of practical coding systems, since it concerns 
the behavior of the error probability once it has already 
reached very low values. From an engineering point of view, 
the following alternate analysis proves more fruitful: fix $P_e$
and study how the blocklength N scales with the gap from
capacity $C - R$. This approach has been successfully applied to
iteratively decoded LDPC ensembles [3], where it was dubbed
the scaling law paradigm, following a terminology coming
from statistical physics: if a system goes through a phase
transition as a control parameter R crosses a critical value C,
then generically around this point there exists a very specific
scaling law. In formulae, we say that a scaling law holds for the
block error probability $P_e(N, R, W)$ of a capacity-achieving
code if there exists a function f, called the mother curve, and
a constant $\mu > 0$, called the scaling exponent, such that

$$\lim_{N \to \infty:\ N^{1/\mu}(C - R) = z} P_e(N, R, W) = f(z). \tag{1}$$



As the blocklength increases, if a rate $R < C$ is chosen, then
$P_e(N, R, W) \to 0$, since the code is supposed to achieve
capacity. On the other hand, $P_e(N, R, W) \to 1$ for any
$R > C$. Equation (1) refines this basic observation, specifying
the speed at which the rate converges to capacity, if a certain
error probability is to be met: roughly speaking, the back-off
from capacity $C - R$ tends to 0 at a speed of $N^{-1/\mu}$. For
random ensembles, it is well known that the scaling exponent
is $\mu = 2$.
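To make the role of the scaling exponent concrete, the following minimal sketch inverts the relation $N^{1/\mu}(C - R) = z$ for N. The function name `blocklength_for_gap` and the default z = 1 are illustrative assumptions: the actual value of z depends on the target error probability through the mother curve, which is not modelled here.

```python
def blocklength_for_gap(gap, mu, z=1.0):
    """Blocklength N solving N**(1/mu) * gap = z, i.e. N = (z / gap)**mu.

    gap : back-off from capacity C - R (must be positive)
    mu  : scaling exponent
    z   : fixed argument of the mother curve (illustrative default)
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    return (z / gap) ** mu


if __name__ == "__main__":
    # Halving the gap multiplies the required blocklength by 2**mu.
    for gap in (0.1, 0.05, 0.025):
        print(f"gap = {gap}: N ~ {blocklength_for_gap(gap, mu=2.0):.0f} for mu = 2")
```

Under this relation, halving the back-off from capacity multiplies the required blocklength by $2^{\mu}$, so a larger scaling exponent means a slower approach to capacity.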

List Decoding. List decoding, which was introduced
independently by Elias and Wozencraft [4], [5], allows the
receiver to collect L possible transmitted messages. An error
is declared only if the correct message does not appear in the
list.

The error exponent of list decoding schemes has been
widely studied in the literature [6], [7], and for random coding
it has been proved that the introduction of a list with finite
size L does not yield any change in this asymptotic regime,
provided that the rate is close enough to capacity [8]. Improved
bounds suitable for both random and structured linear block
codes have been recently investigated [9].

As concerns the scaling exponent, for a random ensemble
transmitted over a Binary Erasure Channel with erasure
probability $\varepsilon$, namely a BEC($\varepsilon$), it can be shown that the
error probability $P_e(N, R, \varepsilon, L)$ is well approximated by the
following expression,
where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp(-u^2/2)\, du$. Consequently,
the scaling exponent remains equal to 2 and also the mother
curve stays unchanged, namely $f(z) = Q(z/\sqrt{\varepsilon(1 - \varepsilon)})$, for
any $L \in \mathbb{N}$.
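The mother curve quoted above can be evaluated numerically. The sketch below is a minimal illustration that expresses the Gaussian tail Q through the complementary error function of the Python standard library; the function names are illustrative and not taken from the paper.

```python
import math


def q_function(x):
    """Gaussian tail Q(x) = (1 / sqrt(2*pi)) * integral from x to infinity of exp(-u^2 / 2) du."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mother_curve_bec(z, eps):
    """Mother curve f(z) = Q(z / sqrt(eps * (1 - eps))) of a random ensemble on BEC(eps)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    return q_function(z / math.sqrt(eps * (1.0 - eps)))


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, mother_curve_bec(z, eps=0.5))
```

At z = 0 the curve equals 1/2, and it decreases as z grows, which reflects the fact that a larger normalized back-off from capacity gives a smaller limiting error probability.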

Polar Codes: a Motivating Case Study. The present research
was motivated by polar codes, which were recently
introduced by Arıkan in [10], and which achieve the capacity of
a large class of channels by means of encoding and decoding
algorithms with complexity $\Theta(N \log N)$.

In particular, for any BMSC W, it has been proved that for
any rate $R < C$ and any $\beta < 1/2$, the block error probability
under the proposed successive cancellation (SC) decoding,
namely $P_e^{\mathrm{SC}}(N, R, W)$, is upper bounded by $2^{-N^{\beta}}$ for N
large enough [11]. With an abuse of notation, it is said that
polar codes achieve an error exponent of $\alpha = 1/2$. This result
has been further refined and extended to the MAP decoder,
showing that both $\log_2(-\log_2 P_e^{\mathrm{SC}})$ and $\log_2(-\log_2 P_e^{\mathrm{MAP}})$
scale as $N/2 + \sqrt{N}/2 \cdot Q^{-1}(R/C) + o(\sqrt{N})$ for any fixed rate
strictly less than capacity [12]. Consequently, even at moderate
blocklengths (for instance, $N > 10^4$), error floors should not
be a problem for polar codes.

However, when we consider rates close to capacity, sim- 
ulation results show that large blocklengths are required in 
order to achieve a desired error probability. Therefore, it is 
interesting to explore the tradeoff between rate and blocklength 
when the error probability is fixed, i.e., to consider the scaling 
approach. 

In [13], the authors provide strong evidence that a lower
bound to the block error probability under SC decoding
satisfies a scaling law (1). In particular, a proper scaling
assumption yields the inequalities

$$f(N^{1/\mu}(C - R)) \le P_e^{\mathrm{SC}}(N, R, W) \le N^{1 - 1/\mu} F(N^{1/\mu}(C - R)). \tag{3}$$

For transmission over the BEC, the asymptotic behavior of
$f(\cdot)$ and $F(\cdot)$ for small values of their argument is provided,
together with an estimate of the scaling exponent,
namely $\mu \approx 3.627$. Therefore, compared to random and LDPC
codes, which have a scaling exponent of 2 for a large class of
parameters and channel models [3], polar codes require larger
blocklengths to achieve the same rate and error probability.
In addition, numerical results show that $P_e^{\mathrm{SC}}(N, R, W)$ is
extremely close to the upper bound (3), provided that such
a value is not too big ($< 10^{-1}$ suffices).

For a generic BMSC, taking as a proxy of the error
probability the sum of the Bhattacharyya parameters, it has
been proved that there exists a universal parameter $\mu'$, such
that reliable communication requires rates that satisfy
$R < C - \alpha N^{-1/\mu'}$, where $\alpha$ is a positive constant [14]. The
exponent $\mu'$ is lower bounded by 3.553 and it has been
conjectured that its value can be increased up to the scaling
parameter of the BEC, i.e., $\mu' = \mu \approx 3.627$.

In order to improve the finite-length performance of polar 
codes, a successive cancellation list (SCL) decoder was proposed
in [15]. Empirically, the usage of L concurrent decoding
paths yields a significant improvement in the achievable error 
probability. Hence, it is interesting to study the behavior of 
the scaling exponent in the context of list decoding. 

Contribution of the Present Work. The main result concerns
the behavior of the MAP decoder as a function of the list
size. By means of a Divide and Intersect (DI) procedure, we
prove that the error probability of the MAP decoder with
list size L, namely $P_e^{\mathrm{MAP}}(N, R, W, L)$, is lower bounded
by $P_e^{\mathrm{MAP}}(N, R, W, L = 1)$ raised to an appropriate power
times a suitable constant, both of which depend only on L.
As a result, we see that list decoding has the potential of
significantly improving the involved constants, but it does not
change the scaling exponent. This is true for any BMSC W
and any family of linear codes whose minimum distance grows
arbitrarily large when the blocklength tends to infinity. Proving
that the minimum distance of polar codes is unbounded in
the limit $N \to +\infty$, we deduce that these conclusions also
hold for polar codes. As a side result, we show that the ideas
developed for the analysis of MAP decoding with list can be
applied to the study of the genie-aided SC decoder.

Organization. Section II states the DI bounds and their
implications on the scaling exponent under MAP decoding
with finite list size L. The proofs of these main results are
contained in Sections III and IV for the BEC and for any
BMSC, respectively. The analysis concerning genie-aided
decoding for transmission over the BEC is discussed in Section
V. The conclusions of the paper are provided in Section VI.

II. Main Results for MAP Decoding with List 

Let $\mathcal{C}_{\mathrm{lin}}$ be a set of linear codes parameterized by their
blocklength N and rate R. For each N and R, let $d_{\min}(N, R)$
denote the minimum distance and let $P_e^{\mathrm{MAP}}(N, R, W, L)$ be
the block error probability for transmission over a BMSC
W with capacity $C \in (0, 1)$ and Bhattacharyya parameter
$Z \in (0, 1)$, under MAP decoding with list size L. In addition,
denote by $\mathcal{C}_{\mathrm{pol}}$ the set of polar codes when transmission takes
place over the BMSC W. Throughout the paper, we are
going to assume that the output alphabet A of W is finite.
Since every BMS channel can be approximated to any desired
accuracy by a BMS channel with finite output alphabet [1],
this assumption does not constitute a significant restriction.

The case $W = \mathrm{BEC}(\varepsilon)$ is handled separately. Indeed, for
the BEC, we can assume that the list size L is a power of
2, say $L = 2^l$, since MAP decoding reduces to solving a
linear system over the finite field $\mathbb{F}_2$. Hence, for any $L \in \mathbb{N}$,
the block error probability, denoted in this specific case by
$P_e^{\mathrm{MAP}}(N, R, \varepsilon, L)$, does not change if we reduce the list
size to $2^{\lfloor \log_2 L \rfloor}$, and, therefore, the bounds can be tightened.
Secondly, when dealing with a BEC, the DI approach itself
and the proofs of the intermediate lemmas are considerably
simpler, while keeping the same flavor as those valid for
general BMSC.

A. Divide and Intersect Bounds 

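Theorem 1 (DI bound - $\mathcal{C}_{\mathrm{lin}}$): Consider transmission using
elements in $\mathcal{C}_{\mathrm{lin}}$ over a BMSC W and set $P_e \in (0, 1)$. For
any N and R so that

$$P_e^{\mathrm{MAP}}(N, R, W, L) > P_e, \tag{4}$$

$$d_{\min}(N, R) > \frac{\ln(P_e/8)}{\ln Z}, \tag{5}$$

the performance of the MAP decoder with list size $L + 1$ ($2L$,
if $W = \mathrm{BEC}(\varepsilon)$) is lower bounded by

$$P_e^{\mathrm{MAP}}(N, R, W, L + 1) \ge \frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, W, L)\right)^2,$$
$$P_e^{\mathrm{MAP}}(N, R, \varepsilon, 2L) \ge \frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, \varepsilon, L)\right)^2. \tag{6}$$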
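The BEC form of Theorem 1 can be checked by exhaustive enumeration on a very small code. The sketch below is only an illustration under stated assumptions: it uses the first-order Reed-Muller code RM(1,3) (length 8, dimension 4, minimum distance 4), which is our choice and not an example from the paper, and it computes the list-MAP error probability exactly by noting that, on the BEC, the information vectors consistent with the unerased positions form a linear space of dimension NR minus the rank of $G_y$.

```python
import math

N = 8
# Generator matrix of RM(1,3); each row is an 8-bit mask (illustrative choice).
G_ROWS = [0b11111111, 0b00001111, 0b00110011, 0b01010101]


def gf2_rank(rows):
    """Rank over GF(2) of a collection of rows given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def map_list_error_bec(eps, list_size):
    """Exact block error probability of list-MAP decoding over BEC(eps).

    An erasure pattern causes an error when the number of information
    vectors consistent with the unerased positions exceeds list_size.
    """
    k = len(G_ROWS)
    full = (1 << N) - 1
    total = 0.0
    for erased in range(1 << N):
        kept = full & ~erased
        dim = k - gf2_rank(r & kept for r in G_ROWS)  # dim of {u : u G_y = 0}
        if (1 << dim) > list_size:
            w = bin(erased).count("1")
            total += eps ** w * (1.0 - eps) ** (N - w)
    return total


if __name__ == "__main__":
    eps = 0.1
    p1 = map_list_error_bec(eps, 1)
    p2 = map_list_error_bec(eps, 2)
    p_e = 0.99 * p1  # any value below p1 satisfies (4) with L = 1
    print("hypothesis (5) holds:", 4 > math.log(p_e / 8) / math.log(eps))
    print("p1 =", p1, " p2 =", p2, " (3/16) * p1^2 =", 3 / 16 * p1 ** 2)
```

For this toy code and $\varepsilon = 0.1$ both hypotheses hold with $Z = \varepsilon$, and the enumerated error probability for list size 2 indeed exceeds $\frac{3}{16}$ times the square of the one for list size 1.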

Theorem 2 (DI bound - $\mathcal{C}_{\mathrm{pol}}$): Consider transmission using
elements in $\mathcal{C}_{\mathrm{pol}}$ over a BMSC W. Fix $P_e \in (0, 1)$ and pick
any N such that

$$N > 2^{n(Z, C, P_e)}, \tag{7}$$

where

$$n(Z, C, P_e) = 2m(Z, P_e) - \ln(1 - C) + \sqrt{-4m(Z, P_e) \cdot \ln(1 - C) + (\ln(1 - C))^2}, \tag{8}$$

with 



ffi (Z,P,)-lo g J '^^-1, |, (9, 



vlnZ-ln 1-Z ml 



and any sufficiently large R so that

$$P_e^{\mathrm{MAP}}(N, R, W, L) > P_e. \tag{10}$$

Then, the bounds (6) hold.
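As a small numerical aid, the sketch below evaluates the threshold in (8) and the resulting smallest power-of-two blocklength exponent allowed by (7). It treats the quantity $m(Z, P_e)$ as an input supplied by the caller; the function names are illustrative assumptions. A short calculation shows that, writing $a = -\ln(1 - C)$, the value n in (8) satisfies $(n - 2m)^2 = 2an$, which the usage example checks.

```python
import math


def n_threshold(m, capacity):
    """Evaluate n(Z, C, P_e) from (8), with m = m(Z, P_e) supplied by the caller."""
    if not 0.0 < capacity < 1.0:
        raise ValueError("capacity must lie in (0, 1)")
    a = -math.log(1.0 - capacity)  # a > 0 for capacity in (0, 1)
    radicand = 4.0 * m * a + a * a
    if radicand < 0.0:
        raise ValueError("m is too negative for the square root in (8)")
    return 2.0 * m + a + math.sqrt(radicand)


def min_blocklength_exponent(m, capacity):
    """Smallest integer k such that N = 2**k satisfies N > 2**n(Z, C, P_e), as in (7)."""
    return math.floor(n_threshold(m, capacity)) + 1


if __name__ == "__main__":
    m, capacity = 3.0, 0.5  # illustrative values only
    n = n_threshold(m, capacity)
    a = -math.log(1.0 - capacity)
    print("n =", n, " identity residual =", (n - 2.0 * m) ** 2 - 2.0 * a * n)
    print("smallest exponent:", min_blocklength_exponent(m, capacity))
```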

The following corollary follows by induction. 

Corollary 1 (DI bounds - any L): Consider transmission
using elements in $\mathcal{C}_{\mathrm{lin}}$ over a BMSC W. Fix $P_e \in (0, 1)$ and
define the following recursion,

$$P_e(m + 1) = \frac{3}{16} \left(P_e(m)\right)^2, \quad m \in \mathbb{N}, \tag{11}$$

with the initial condition $P_e(1) = P_e$. Pick any N and R
such that (4) and (5) hold with $P_e(L)$ instead of $P_e$, or, if
the code is in $\mathcal{C}_{\mathrm{pol}}$, any N satisfying (7) and any sufficiently
large R satisfying (10) with $P_e(L)$ instead of $P_e$. Then, the
performance of the MAP decoder with list size $L + 1$ is lower
bounded by

$$P_e^{\mathrm{MAP}}(N, R, W, L + 1) \ge \left(\frac{3}{16}\right)^{2^L - 1} \cdot \left(P_e^{\mathrm{MAP}}(N, R, W, L = 1)\right)^{2^L}. \tag{12}$$

If $W = \mathrm{BEC}(\varepsilon)$, consider the recursion (11) with the initial
condition $P_e(0) = P_e$. If (4)-(5) and (7)-(10) are satisfied
with $P_e(\log_2 L)$ instead of $P_e$ for codes in $\mathcal{C}_{\mathrm{lin}}$ and $\mathcal{C}_{\mathrm{pol}}$,
respectively, then the performance of the MAP decoder with
list size $2L$ is lower bounded by

$$P_e^{\mathrm{MAP}}(N, R, \varepsilon, 2L) \ge \left(\frac{3}{16}\right)^{2L - 1} \cdot \left(P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1)\right)^{2L}. \tag{13}$$
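The induction behind Corollary 1 amounts to unrolling the recursion (11). The following sketch, with illustrative function names, iterates the recursion in exact rational arithmetic and compares it with the closed form that appears in (12) and (13): after s steps, the starting value p is mapped to $(3/16)^{2^s - 1} p^{2^s}$.

```python
from fractions import Fraction


def recursion(p_e, steps):
    """Apply P_e(m + 1) = (3/16) * P_e(m)**2 exactly `steps` times."""
    value = Fraction(p_e)
    for _ in range(steps):
        value = Fraction(3, 16) * value ** 2
    return value


def closed_form(p_e, steps):
    """Closed form (3/16)**(2**steps - 1) * p_e**(2**steps)."""
    return Fraction(3, 16) ** (2 ** steps - 1) * Fraction(p_e) ** (2 ** steps)


if __name__ == "__main__":
    start = Fraction(1, 10)  # illustrative starting error probability
    for steps in range(5):
        print(steps, float(recursion(start, steps)), recursion(start, steps) == closed_form(start, steps))
```

With steps = L this gives the constant and the exponent of (12); with steps = $\log_2 L + 1$ it gives those of (13).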



B. Scaling Exponent of MAP Decoding with List 

An immediate consequence of the DI bounds is that the
scaling exponent defined in (1) does not change as long as L
is fixed and finite. More formally, one can define the existence
of a scaling law as follows.

Definition 1 (Scaling law): Consider a set of codes, parameterized
by their blocklength N and rate R, transmitted over
the channel W, and processed by means of the decoder $\mathcal{D}$,
and let $P_e^{\mathcal{D}}(N, R, W)$ denote the block error probability. We
say that a scaling law holds, if there exist a real number
$\mu \in (0, +\infty)$, namely the scaling exponent, and a function
$f : \mathbb{R} \to [0, 1]$, namely the mother curve, such that

$$\lim_{N \to \infty:\ N^{1/\mu}(C - R) = z} P_e^{\mathcal{D}}(N, R, W) = f(z). \tag{14}$$

The theorem below, which bounds the scaling
behavior of the MAP decoder with any finite list size L, is
easily deduced from Corollary 1.

Theorem 3 (Scaling exponent - MAP decoding with list):
Consider the set of polar codes $\mathcal{C}_{\mathrm{pol}}$ transmitted over a BMSC
W. If the scaling law of Definition 1 holds for the MAP
decoder with mother curve f and scaling exponent $\mu$, then
for any $L \in \mathbb{N}$,

$$\limsup_{N \to \infty:\ N^{1/\mu}(C - R) = z} P_e^{\mathrm{MAP}}(N, R, W, L) \le f(z), \tag{15}$$

$$\liminf_{N \to \infty:\ N^{1/\mu}(C - R) = z} P_e^{\mathrm{MAP}}(N, R, W, L) \ge \left(\frac{3}{16}\right)^{2^{L-1} - 1} \cdot \left(f(z)\right)^{2^{L-1}}. \tag{16}$$

In words, if a scaling law holds for the MAP decoder with
list size L, the scaling exponent $\mu$ is the same as that for the
original MAP decoder with L = 1. Therefore, the speed at
which capacity is approached as the blocklength grows large
does not depend on L. Notice that, in general, Theorem 3
holds for any set of linear codes whose minimum distance is
unbounded as the blocklength grows large.

III. Proof of DI Bounds for MAP Decoding with 
List and W = BEC(e) 

As the name suggests, the DI procedure has two main
ingredients: the Intersect step is based on the correlation
inequality stated in Section III-A; the Divide step is based
on the existence of a suitable subset of codewords, which is
discussed in Section III-B. The actual bound for linear and
polar codes is proved for the simple case L = 1 in Sections
III-C and III-D, respectively, while the generalization to any
list size is presented in Section III-E.



A. Intersect Step: Correlation Inequality 

Since the BEC($\varepsilon$) is a symmetric channel, we can assume
that the all-zero codeword has been transmitted. Consequently,
we map the channel output into the erasure pattern
$y = (y_1, \cdots, y_N) \in \{0, 1\}^N$, with $y_i = 1$ meaning that the i-th
BEC has yielded an erasure symbol and $y_i = 0$, otherwise.
Let $G_y$ be the part of the generator matrix G obtained by
eliminating the columns corresponding to the erased symbols,
i.e., all the columns of index i s.t. $y_i = 1$. It is easy to
check that the MAP decoder outputs the information vector
$u = (u_1, \cdots, u_{NR})$ if and only if $u G_y = 0$. Define $E_u$ to be
the set of all the erasure patterns such that u solves $u G_y = 0$,
i.e.,

$$E_u = \{y \in \{0, 1\}^N \mid u G_y = 0\}. \tag{17}$$

Let $I_u$ be the set of positions i in which $(uG)_i$ equals 1,
namely,

$$I_u = \{i \in \{1, \cdots, N\} \mid (uG)_i = 1\}. \tag{18}$$

Since $P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1)$ is the probability that there
exists a non-zero information vector u that satisfies $u G_y = 0$,
we have

$$P\Big(\bigcup_{u \in U} E_u\Big) = P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1),$$

where $U = \mathbb{F}_2^{NR} \setminus 0^{NR}$, and $0^{NR}$ denotes a sequence of NR
0s.

We start with two simple lemmas computing $P(E_u)$ and
showing the positive correlation between the events (17).

Lemma 1 ($P(E_u)$): Let $u \in \mathbb{F}_2^{NR}$ and let $E_u$ be defined in
(17). Then,

$$P(E_u) = \varepsilon^{|I_u|}, \tag{19}$$

where $I_u$ is given by (18).

Proof: Observe that u solves $u G_y = 0$ if and only if
all the positions i s.t. $(uG)_i = 1$ are erased by the BEC($\varepsilon$).
Therefore, $P(E_u)$ equals the probability that $|I_u|$ independent
erasures at those positions occur, which implies (19). ■

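Lemma 2 (Positive correlation between couples): Let
$u, \tilde{u} \in \mathbb{F}_2^{NR}$. Then,

$$P(E_u \cap E_{\tilde{u}}) \ge P(E_u) \cdot P(E_{\tilde{u}}). \tag{20}$$

Proof: By definition (18), we obtain

$$P(E_u \cap E_{\tilde{u}}) = \varepsilon^{|I_u \cup I_{\tilde{u}}|} = \varepsilon^{|I_u| + |I_{\tilde{u}}| - |I_u \cap I_{\tilde{u}}|} \ge \varepsilon^{|I_u| + |I_{\tilde{u}}|} = P(E_u) \cdot P(E_{\tilde{u}}), \tag{21}$$

which gives (20). ■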
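Lemmas 1 and 2 can be verified by brute force on a toy example. The sketch below builds the events $E_u$ directly from their definition (17), by enumerating all erasure patterns, and compares their probabilities with (19) and (20). The 2 x 4 generator matrix is a hypothetical choice made only for illustration.

```python
from itertools import product

# Hypothetical toy generator matrix: 2 information bits, 4 code bits.
G = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
N = len(G[0])


def codeword(u):
    """Codeword uG over GF(2)."""
    return [sum(ui * gi for ui, gi in zip(u, col)) % 2 for col in zip(*G)]


def support(u):
    """I_u: positions where (uG)_i = 1 (0-based)."""
    return {i for i, bit in enumerate(codeword(u)) if bit == 1}


def event_E(u):
    """E_u from (17): patterns y (y_i = 1 means erased) with u G_y = 0,
    i.e. (uG)_i = 0 at every unerased position."""
    cw = codeword(u)
    return {
        y
        for y in product((0, 1), repeat=N)
        if all(cw[i] == 0 for i in range(N) if y[i] == 0)
    }


def prob_event(event, eps):
    """Probability of a set of erasure patterns on BEC(eps)."""
    return sum(eps ** sum(y) * (1.0 - eps) ** (N - sum(y)) for y in event)


if __name__ == "__main__":
    eps = 0.3
    u, v = (1, 0), (0, 1)
    pu, pv = prob_event(event_E(u), eps), prob_event(event_E(v), eps)
    puv = prob_event(event_E(u) & event_E(v), eps)
    print("Lemma 1:", pu, "vs", eps ** len(support(u)))
    print("Lemma 2:", puv, ">=", pu * pv)
```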

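Let us now generalize Lemma 2 to unions of sets.

Lemma 3 (Positive correlation - BEC($\varepsilon$), L = 1): Let
$U_1, U_2 \subset \mathbb{F}_2^{NR}$. Then,

$$P\Big(\bigcup_{u \in U_1} E_u \cap \bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big) \ge P\Big(\bigcup_{u \in U_1} E_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big). \tag{22}$$

The proof of this result can be found in Appendix A and
comes from an application of the FKG inequality, originally
proposed in [16].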



B. Divide Step: Existence of a Suitable Subset of Codewords 

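The aim of this section is to show that there exists
$U_1 \subset U$ such that $P(\bigcup_{u \in U_1} E_u)$ is slightly smaller than
$\frac{1}{2} P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1)$. To do so, we first upper bound
$P(E_u)$ for all $u \in U$.

Lemma 4 (No big jumps - BEC($\varepsilon$)): Let $P_e \in (0, 1)$ and
$\varepsilon \in (0, 1)$. Then, for any N and R so that

$$d_{\min}(N, R) > \frac{\ln(P_e/8)}{\ln \varepsilon}, \tag{23}$$

the probability of $E_u$ is bounded by

$$P(E_u) < \frac{P_e}{8}, \quad \forall\, u \in U. \tag{24}$$

Proof: From Lemma 1 and the definition of minimum
distance, we obtain that

$$P(E_u) = \varepsilon^{|I_u|} \le \varepsilon^{d_{\min}}. \tag{25}$$

Since $\ln \varepsilon < 0$, condition (23) is equivalent to $\varepsilon^{d_{\min}} < P_e/8$, and the thesis follows. ■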
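The condition of Lemma 4 is easy to evaluate numerically. The sketch below, with illustrative function names, computes the right-hand side of (23) and the smallest integer minimum distance that satisfies it, and then checks the resulting single-event bound $\varepsilon^{d} < P_e/8$.

```python
import math


def dmin_threshold(p_e, eps):
    """Right-hand side of (23): ln(p_e / 8) / ln(eps)."""
    if not (0.0 < p_e < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("p_e and eps must lie in (0, 1)")
    return math.log(p_e / 8.0) / math.log(eps)


def smallest_sufficient_distance(p_e, eps):
    """Smallest integer d with d > ln(p_e / 8) / ln(eps)."""
    return math.floor(dmin_threshold(p_e, eps)) + 1


if __name__ == "__main__":
    p_e, eps = 1e-3, 0.5  # illustrative values only
    d = smallest_sufficient_distance(p_e, eps)
    print("threshold:", dmin_threshold(p_e, eps), " smallest integer d:", d)
    print("eps**d < p_e / 8:", eps ** d < p_e / 8.0)
```

For $P_e = 10^{-3}$ and $\varepsilon = 1/2$ the threshold equals $\log_2 8000$, so a minimum distance of 13 suffices while 12 does not.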

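The existence of a subset of codewords with the desired
property is a consequence of the previous lemma.

Corollary 2 (Existence of $U_1$): Let $P_e \in (0, 1)$ and
$\varepsilon \in (0, 1)$. Then, for any N and R so that (23) and
$P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1) > P_e$ hold, there exists $U_1 \subset U$ which
satisfies

$$P\Big(\bigcup_{u \in U_1} E_u\Big) \ge \frac{3}{8} P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1),$$
$$P\Big(\bigcup_{u \in U_1} E_u\Big) \le \frac{1}{2} P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1). \tag{26}$$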
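One way to see why such a subset exists is a greedy construction: add information vectors one at a time until the probability of the union first reaches 3/8 of the total; since each single event has probability below one eighth of the total, the union cannot jump past 1/2. The sketch below illustrates this idea on the first-order Reed-Muller code RM(1,3) over BEC(0.1). Both the code and the greedy ordering are our illustrative assumptions, not the construction used in the paper's proof.

```python
N = 8
# Rows of a generator matrix of RM(1,3) as 8-bit masks (illustrative choice, d_min = 4).
G_ROWS = [0b11111111, 0b00001111, 0b00110011, 0b01010101]


def support_mask(u):
    """Bitmask of I_u, where the integer u selects rows of G by its binary digits."""
    mask = 0
    for j, row in enumerate(G_ROWS):
        if (u >> j) & 1:
            mask ^= row
    return mask


def prob_union(info_vectors, eps):
    """P(union of E_u) on BEC(eps), by enumerating all erasure patterns."""
    supports = [support_mask(u) for u in info_vectors]
    total = 0.0
    for erased in range(1 << N):
        # E_u occurs when every position of I_u is erased.
        if any((s & erased) == s for s in supports):
            w = bin(erased).count("1")
            total += eps ** w * (1.0 - eps) ** (N - w)
    return total


def greedy_subset(eps):
    """Grow U_1 until P(union) first reaches 3/8 of the L = 1 MAP error probability."""
    all_u = list(range(1, 1 << len(G_ROWS)))  # non-zero information vectors
    p_map = prob_union(all_u, eps)
    chosen = []
    for u in all_u:
        chosen.append(u)
        if prob_union(chosen, eps) >= 3.0 / 8.0 * p_map:
            break
    return chosen, prob_union(chosen, eps), p_map


if __name__ == "__main__":
    chosen, p_sub, p_map = greedy_subset(0.1)
    print("|U_1| =", len(chosen), " ratio =", p_sub / p_map)
```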



C. Proof of DI Bound for Linear Codes 

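At this point, we are ready to present the proof of Theorem
1 for the BEC and for a list size L = 1. Recall that the
Bhattacharyya parameter of a BEC($\varepsilon$) is $Z = \varepsilon$.

Proof: Pick $U_1$ that satisfies (26) and let $U_2 = U_1^c$.
Consequently,

$$\frac{3}{8} \cdot P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1) \le P\Big(\bigcup_{u \in U_1} E_u\Big),$$
$$\frac{1}{2} \cdot P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1) \le P\Big(\bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big).$$

Hence,

$$\frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 1)\right)^2 \le P\Big(\bigcup_{u \in U_1} E_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big).$$

In addition, the following chain of inequalities holds,

$$P\Big(\bigcup_{u \in U_1} E_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big) \le P\Big(\bigcup_{u \in U_1} E_u \cap \bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big) = P\Big(\bigcup_{u \in U_1,\, \tilde{u} \in U_2} E_u \cap E_{\tilde{u}}\Big) \le P\Big(\bigcup_{u, \tilde{u} \in U,\, u \ne \tilde{u}} E_u \cap E_{\tilde{u}}\Big),$$

where the first inequality comes from the application of
Lemma 3 and the last passage is a direct consequence of
$U_1 \cap U_2 = \emptyset$. Noticing that

$$P\Big(\bigcup_{u, \tilde{u} \in U,\, u \ne \tilde{u}} E_u \cap E_{\tilde{u}}\Big) = P_e^{\mathrm{MAP}}(N, R, \varepsilon, L = 2),$$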

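we obtain the desired result. ■

D. Proof of DI Bound for Polar Codes

In order to apply the bound to polar codes, it suffices to
prove the lower bound on the minimum distance, as required.

Lemma 5 ($d_{\min}$ of polar codes - BEC($\varepsilon$)): Consider a polar
code in $\mathcal{C}_{\mathrm{pol}}$ for a BEC($\varepsilon$). Let $P_e \in (0, 1)$, $\varepsilon \in (0, 1)$,
and $N > 2^{n(\varepsilon, P_e)}$, where

$$n(\varepsilon, P_e) = 2m(\varepsilon, P_e) - \ln \varepsilon + \sqrt{-4m(\varepsilon, P_e) \cdot \ln \varepsilon + (\ln \varepsilon)^2}, \tag{27}$$

with

_ px _ , / 21n(P e /8) • ln(l - e) \
m(e,P e i-log 2 |- 7 -|. (28)
. 2in(p e /8) \ /
« In. e • In 1 - e in «

Then, the lower bound on $d_{\min}$ (23) holds.

Thanks to this result, whose proof is in Appendix B, Theorem 2
follows from Theorem 1. Comparing (9) with (28), we
notice that, for the BEC, the constraint on N is less tight than
the one required for any BMSC.

E. Generalization to Any List Size

Set $l = \log_2 L$ and define $E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})}$ to be the set of
all erasure patterns y such that the set of solutions of the
linear system $u G_y = 0$ contains the linear span generated by
$\{u^{(1)}, \cdots, u^{(l)}\}$, i.e.,

$$E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} = \bigcap_{u \in \mathrm{span}(u^{(1)}, \cdots, u^{(l)})} E_u = \{y \in \{0, 1\}^N \mid u G_y = 0 \ \ \forall\, u \in \mathrm{span}(u^{(1)}, \cdots, u^{(l)})\}. \tag{29}$$

Consider the set $\mathrm{LS}_l$ containing all the linear spans of $\mathbb{F}_2^{NR}$
with $2^l$ elements. In formulas,

$$\mathrm{LS}_l = \{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \mid u^{(i)} \in \mathbb{F}_2^{NR} \ \ \forall\, i \in \{1, \cdots, l\},\ |\mathrm{span}(u^{(1)}, \cdots, u^{(l)})| = 2^l\}. \tag{30}$$

For the Intersect step, we need now the generalization of
Lemma 3, whose proof is given in Appendix C. We will also
need the subsequent simple result concerning the intersection
of events (29).

Lemma 6 (Positive correlation - BEC($\varepsilon$), any L): Let
$P_1, P_2 \subset \mathrm{LS}_l$. Then,

$$P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} \cap \bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \in P_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}\Big) \ge P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})}\Big) \cdot P\Big(\bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \in P_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}\Big). \tag{31}$$

Lemma 7 (Intersections): For any $\mathrm{span}(u^{(1)}, \cdots, u^{(l)})$ and
$\mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})$,

$$E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} \cap E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})} = E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}. \tag{32}$$

Proof: Since $\mathrm{span}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \supseteq \mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \cup \mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})$,

$$E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} \cap E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})} \supseteq E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}.$$

On the other hand, by linearity of the code, for any
$u \in \mathrm{span}(u^{(1)}, \cdots, u^{(l)})$ and any $v \in \mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})$, if
$u G_y = 0$ and $v G_y = 0$, then $w G_y = 0$ for all
$w \in \{u + v : u \in \mathrm{span}(u^{(1)}, \cdots, u^{(l)}),\ v \in \mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})\}$. As a result,

$$E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} \cap E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})} \subseteq E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})},$$

and the thesis follows. ■

As concerns the Divide step, Corollary 3 generalizes the
result of Corollary 2 to any list size L.

Corollary 3 (Existence of $P_1$): Let $P_e \in (0, 1)$ and
$\varepsilon \in (0, 1)$. Then, for any R and N satisfying

$$P_e^{\mathrm{MAP}}(N, R, \varepsilon, L) > P_e, \tag{33}$$

$$d_{\min}(N, R) > \frac{\ln(P_e/8)}{\ln \varepsilon}, \tag{34}$$

there exists $P_1 \subset \mathrm{LS}_l$ such that

$$P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})}\Big) \ge \frac{3}{8} P_e^{\mathrm{MAP}}(N, R, \varepsilon, L),$$
$$P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})}\Big) \le \frac{1}{2} P_e^{\mathrm{MAP}}(N, R, \varepsilon, L). \tag{35}$$

At this point, we can prove Theorem 1 for the BEC and
for any list size L.

Proof: Pick $P_1$ that satisfies (35) and let $P_2 = P_1^c$.
Consequently, applying Lemma 6 and 7, we have

$$\frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, \varepsilon, L)\right)^2 \le P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})}\Big) \cdot P\Big(\bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \in P_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}\Big)$$
$$\le P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)})} \cap \bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \in P_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}\Big)$$
$$\le P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \cdots, u^{(l)}) \in P_1,\ \mathrm{span}(\tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)}) \in P_2} E_{\mathrm{sp}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})}\Big)$$
$$\le P_e^{\mathrm{MAP}}(N, R, \varepsilon, 2L),$$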
where the last inequality is due to the fact that
$|\mathrm{span}(u^{(1)}, \cdots, u^{(l)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(l)})| \ge 2^{l+1} = 2L$, since
$P_1 \cap P_2 = \emptyset$. ■

IV. Proof of DI Bounds for MAP Decoding with List and Any BMSC

A. Case L = 1

Since the information vectors are equiprobable, the MAP
decision rule is given by

$$\hat{u} = \arg\max_{u}\, p(y \mid u).$$

Define $E'_u$ as the set of all y such that $p(y \mid u) \ge p(y \mid 0^{NR})$.
Simple algebraic manipulations show that

$$E'_u = \Big\{y \in A^N \ \Big|\ \sum_{i=1}^{N} \ln \frac{p(y_i \mid (uG)_i)}{p(y_i \mid 0)} \ge 0\Big\} = \Big\{y \in A^N \ \Big|\ \sum_{i \in I_u} \ln \frac{p(y_i \mid 1)}{p(y_i \mid 0)} \ge 0\Big\}, \tag{36}$$

where A is the output alphabet of the channel and $I_u$ is defined
in (18).

Since it is clear that

$$P\Big(\bigcup_{u \in U} E'_u\Big) = P_e^{\mathrm{MAP}}(N, R, W, L = 1),$$

we are going to follow the same procedure as in Sections III-A
to III-C to prove Theorem 1 for the case L = 1.

As concerns the Intersect part, we generalize the inequality
of Lemma 3 with the following correlation result, whose proof
can be found in Appendix D.

Lemma 8 (Positive correlation - BMSC, L = 1): Let
$U'_1, U'_2 \subset \mathbb{F}_2^{NR}$. Then,

$$P\Big(\bigcup_{u \in U'_1} E'_u \cap \bigcup_{\tilde{u} \in U'_2} E'_{\tilde{u}}\Big) \ge P\Big(\bigcup_{u \in U'_1} E'_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U'_2} E'_{\tilde{u}}\Big). \tag{37}$$

For the Divide step, we need to show that $P(E'_u)$ can be
made as small as we want, as done in Lemma 4 for the events
$E_u$.

Lemma 9 (No big jumps - BMSC): Let $P_e \in (0, 1)$ and
$Z \in (0, 1)$. Then, for any N and R so that

$$d_{\min}(N, R) > \frac{\ln(P_e/8)}{\ln Z}, \tag{38}$$

the probability of $E'_u$ can be bounded as

$$P(E'_u) < \frac{P_e}{8}, \quad \forall\, u \in U. \tag{39}$$

Proof: It is possible to relate the probability of $E'_u$ and
the Bhattacharyya parameter Z of the BMSC W as [1]

$$P(E'_u) \le Z^{|I_u|}. \tag{40}$$

Since $|I_u| \ge d_{\min} > \ln(P_e/8)/\ln Z$, the thesis easily follows. ■

Now, we are ready for the proof of Theorem 1 with L = 1.

Proof: By Lemma 9, there exists $U'_1 \subset U$ which satisfies

$$P\Big(\bigcup_{u \in U'_1} E'_u\Big) \ge \frac{3}{8} P_e^{\mathrm{MAP}}(N, R, W, L = 1),$$
$$P\Big(\bigcup_{u \in U'_1} E'_u\Big) \le \frac{1}{2} P_e^{\mathrm{MAP}}(N, R, W, L = 1).$$

Let $U'_2 = (U'_1)^c$. Then,

$$\frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, W, L = 1)\right)^2 \le P\Big(\bigcup_{u \in U'_1} E'_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U'_2} E'_{\tilde{u}}\Big).$$

In addition, using Lemma 8 and the fact that $U'_1$ and $U'_2$ are
disjoint,

$$P\Big(\bigcup_{u \in U'_1} E'_u\Big) \cdot P\Big(\bigcup_{\tilde{u} \in U'_2} E'_{\tilde{u}}\Big) \le P\Big(\bigcup_{u \in U'_1} E'_u \cap \bigcup_{\tilde{u} \in U'_2} E'_{\tilde{u}}\Big) \le P\Big(\bigcup_{u, \tilde{u} \in U,\, u \ne \tilde{u}} E'_u \cap E'_{\tilde{u}}\Big) = P_e^{\mathrm{MAP}}(N, R, W, L = 2),$$

which gives the desired result. ■

Lemma 10 generalizes the result of Lemma 5, showing that
for N big enough the required lower bound on the minimum
distance holds. Hence, the DI bound and the subsequent
scaling result are true for the class of polar codes.

Lemma 10 ($d_{\min}$ of polar codes - BMSC): Let
$P_e \in (0, 1)$, $Z \in (0, 1)$, and $N > 2^{n(Z, C, P_e)}$, where
$n(Z, C, P_e)$ is given by (8). Then, the lower bound on $d_{\min}$
(38) holds.

B. Generalization to Any List Size

Let $E'_{u^{(1)}, \cdots, u^{(L)}}$ be the set of all y such that
$p(y \mid u) \ge p(y \mid 0^{NR})$ for all $u \in \{u^{(1)}, \cdots, u^{(L)}\}$, i.e.,

$$E'_{u^{(1)}, \cdots, u^{(L)}} = \Big\{y \in A^N \ \Big|\ \sum_{i=1}^{N} \ln \frac{p(y_i \mid (uG)_i)}{p(y_i \mid 0)} \ge 0 \ \ \forall\, u \in \{u^{(1)}, \cdots, u^{(L)}\}\Big\}. \tag{42}$$

Consider the set $\mathrm{SS}_L$ containing all the subsets of L distinct
elements of $\mathbb{F}_2^{NR}$. In formulas,

$$\mathrm{SS}_L = \{\{u^{(1)}, \cdots, u^{(L)}\} : u^{(i)} \in \mathbb{F}_2^{NR} \ \ \forall\, i \in \{1, \cdots, L\},\ u^{(i)} \ne u^{(j)} \ \ \forall\, i \ne j\}. \tag{43}$$

Following similar steps to those of Section III-E, we
generalize Lemma 8 with the result below, whose proof is in
Appendix F.

Lemma 11 (Positive correlation - BMSC, any L): Let



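$P'_1, P'_2 \subset \mathrm{SS}_L$. Then,

$$P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}} \cap \bigcup_{\{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\} \in P'_2} E'_{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}}\Big) \ge P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}}\Big) \cdot P\Big(\bigcup_{\{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\} \in P'_2} E'_{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}}\Big). \tag{44}$$

The proof of Theorem 1 follows.

Proof: By Lemma 9, there exists $P'_1 \subset \mathrm{SS}_L$ such that

$$P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}}\Big) \ge \frac{3}{8} P_e^{\mathrm{MAP}}(N, R, W, L),$$
$$P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}}\Big) \le \frac{1}{2} P_e^{\mathrm{MAP}}(N, R, W, L).$$

Pick $P'_2 = (P'_1)^c$. Consequently, applying Lemma 11, we have

$$\frac{3}{16} \cdot \left(P_e^{\mathrm{MAP}}(N, R, W, L)\right)^2 \le P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}}\Big) \cdot P\Big(\bigcup_{\{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\} \in P'_2} E'_{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}}\Big)$$
$$\le P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1} E'_{u^{(1)}, \cdots, u^{(L)}} \cap \bigcup_{\{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\} \in P'_2} E'_{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}}\Big)$$
$$\le P\Big(\bigcup_{\{u^{(1)}, \cdots, u^{(L)}\} \in P'_1,\ \{\tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\} \in P'_2} E'_{u^{(1)}, \cdots, u^{(L)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}}\Big)$$
$$\le P_e^{\mathrm{MAP}}(N, R, W, L + 1),$$

where the last inequality is due to the fact that
$|\{u^{(1)}, \cdots, u^{(L)}, \tilde{u}^{(1)}, \cdots, \tilde{u}^{(L)}\}| \ge L + 1$, since $P'_1 \cap P'_2 = \emptyset$. ■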
V. Further Results for Genie-Aided SC Decoding
and W = BEC($\varepsilon$)

Consider a polar code in $\mathcal{C}_{\mathrm{pol}}$ transmitted over a BEC($\varepsilon$),
denote by $W_N^{(i)}$ the i-th synthetic channel, which is a BEC
of Bhattacharyya parameter $Z_i$, and let $P_e^{\mathrm{SC}}(N, R, \varepsilon, k)$ be
the block error probability under SC decoding aided by a
k-genie. More precisely, when we reach a synthetic channel that
is erased, the genie tells the decoder the value of the erased
bit, and it does so a maximum of k times.

Note that a k-genie-aided SC decoder and a SCL decoder
with list size $2^k$ behave similarly but not identically: if $W_N^{(i)}$ is
erased, the former is helped by the genie, while the latter splits
all the active paths. However, when we reach the synthetic
channels associated to the frozen bits, the SCL decoder gains
new information which can lead to the elimination of some of
the existing decoding paths, whereas the genie-aided decoder
cannot take advantage of this new information. Therefore, the
SCL decoder always succeeds when the genie-aided decoder
succeeds, but in addition it might also succeed in some cases
where the genie-aided decoder fails.

A. Correlation Inequality

Let $y \in \{0, 1\}^N$ denote the erasure pattern of the channel
and let $F_i$ ($i \in \{1, \cdots, N\}$) be the set containing all y such
that $W_N^{(i)}$ is erased, i.e.,

$$F_i = \{y \in \{0, 1\}^N \mid W_N^{(i)} \text{ erases}\}. \tag{45}$$

Denoting by $\mathcal{F}$ the set of frozen positions, it is clear that

$$P\Big(\bigcup_{i \in \mathcal{F}^c} F_i\Big) = P_e^{\mathrm{SC}}(N, R, \varepsilon, k = 0). \tag{46}$$

Following a similar proof to that presented in Appendix A,
it is possible to show the correlation inequality below.

Lemma 12 (Positive correlation between erasures - k = 0):
Let $I_1, I_2 \subset \{1, \cdots, N\}$. Then,

$$P\Big(\bigcup_{i \in I_1} F_i \cap \bigcup_{\tilde{i} \in I_2} F_{\tilde{i}}\Big) \ge P\Big(\bigcup_{i \in I_1} F_i\Big) \cdot P\Big(\bigcup_{\tilde{i} \in I_2} F_{\tilde{i}}\Big). \tag{47}$$

In general, define $F_{i_0, \cdots, i_k}$ to be the set of all the erasure
patterns such that $W_N^{(i)}$ erases for all $i \in \{i_0, \cdots, i_k\}$, i.e.,

$$F_{i_0, \cdots, i_k} = \{y \in \{0, 1\}^N \mid W_N^{(i)} \text{ erases} \ \ \forall\, i \in \{i_0, \cdots, i_k\}\}, \tag{48}$$

and consider the set of positions $\mathrm{SP}_k$ containing all the subsets
of $k + 1$ distinct elements of $\mathcal{F}^c$,

$$\mathrm{SP}_k = \{\{i_0, \cdots, i_k\} : i_m \in \mathcal{F}^c \ \ \forall\, m \in \{0, \cdots, k\},\ i_m \ne i_n \ \ \forall\, m \ne n\}. \tag{49}$$

It is clear that

$$P\Big(\bigcup_{\{i_0, \cdots, i_k\} \in \mathrm{SP}_k} F_{i_0, \cdots, i_k}\Big) = P_e^{\mathrm{SC}}(N, R, \varepsilon, k).$$

In addition, with a small effort we generalize the result of
Lemma 12, following the line of thought exposed in the Appendix.

Lemma 13 (Positive correlation between erasures - any k):
Let $R_1, R_2 \subset \mathrm{SP}_k$. Then,

$$P\Big(\bigcup_{\{i_0, \cdots, i_k\} \in R_1} F_{i_0, \cdots, i_k} \cap \bigcup_{\{\tilde{i}_0, \cdots, \tilde{i}_k\} \in R_2} F_{\tilde{i}_0, \cdots, \tilde{i}_k}\Big) \ge P\Big(\bigcup_{\{i_0, \cdots, i_k\} \in R_1} F_{i_0, \cdots, i_k}\Big) \cdot P\Big(\bigcup_{\{\tilde{i}_0, \cdots, \tilde{i}_k\} \in R_2} F_{\tilde{i}_0, \cdots, \tilde{i}_k}\Big). \tag{50}$$



B. DI Bound 

For the sake of analytical tractability, the sum of the
Bhattacharyya parameters of the unfrozen synthetic channels,
namely $\sum_{i \in \mathcal{F}^c} Z_i$, has been studied in detail and is
regarded as an accurate proxy for the block error probability
under SC decoding. Following [14], we talk of strong reliability
condition when, in order to guarantee a certain performance
level, we fix $\sum_{i \in \mathcal{F}^c} Z_i$, instead of $P_e^{\mathrm{SC}}(N, R, \varepsilon, k = 0)$. Such
a requirement is strong, in the sense that, applying the union
bound to (46) and observing that $P(F_i) = Z_i$, one obtains

$$P_e^{\mathrm{SC}}(N, R, \varepsilon, k = 0) \le \sum_{i \in \mathcal{F}^c} Z_i. \tag{51}$$

However, numerical simulations and, recently, theoretical
results [17] point out that the RHS and LHS of (51) are pretty
close.
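The sum of Bhattacharyya parameters discussed above is straightforward to compute on the BEC. The sketch below is a minimal illustration under two assumptions that are not spelled out in this section: the standard BEC polarization recursion, in which a synthetic channel with parameter Z splits into two channels with parameters $2Z - Z^2$ and $Z^2$, and the usual choice of the unfrozen set as the most reliable synthetic channels. The function names are illustrative.

```python
def bec_bhattacharyya(n, eps):
    """Bhattacharyya parameters of the N = 2**n synthetic channels of BEC(eps).

    Assumes the standard BEC recursion: Z -> (2Z - Z**2, Z**2) at each step.
    """
    z = [eps]
    for _ in range(n):
        z = [t for x in z for t in (2.0 * x - x * x, x * x)]
    return z


def union_bound(n, eps, rate):
    """Sum of the K = floor(rate * N) smallest Z_i, i.e. the right-hand side
    of (51) when the unfrozen set is taken as the K most reliable channels."""
    z = sorted(bec_bhattacharyya(n, eps))
    k = int(rate * len(z))
    return sum(z[:k])


if __name__ == "__main__":
    eps = 0.5
    for n in (6, 8, 10):
        print(f"N = {2 ** n}: sum of Z_i at rate 0.3 = {union_bound(n, eps, 0.3):.3e}")
```

A quick sanity check on the recursion is that the two children of a channel with parameter Z have parameters summing to 2Z, so the sum over all N synthetic channels always equals $N\varepsilon$.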

Consequently, we can fix a lower bound on $P_e^{\mathrm{SC}}(N, R, \varepsilon, k)$,
say $P_e$, and, at the same time, an upper bound on $\sum_{i \in \mathcal{F}^c} Z_i$,
say 1. Indeed, the former inequality ensures that the block
error probability of the k-genie-aided SC decoder does not
become too small, while the latter allows us to make each
single Bhattacharyya parameter $Z_i$ ($i \in \mathcal{F}^c$) as tiny as we
want, provided that a blocklength large enough is taken, as
stated in Lemma 14. Under this mild further assumption, a
bound similar to that of Theorem 2 holds for genie-aided SC
decoding.

Theorem 4 (DI bound - genie-aided decoding): Consider
the transmission of a polar code in $\mathcal{C}_{\mathrm{pol}}$ over a BEC($\varepsilon$) and
fix $P_e \in (0, 1)$. Pick N big enough and any R that ensures

$$1 > \sum_{i \in \mathcal{F}^c} Z_i \ge P_e^{\mathrm{SC}}(N, R, \varepsilon, k) > P_e. \tag{52}$$

Then, the performance of the $(k + 1)$-genie-aided SC decoder
is lower bounded by

$$P_e^{\mathrm{SC}}(N, R, \varepsilon, k + 1) \ge \frac{3}{16} \cdot \left(P_e^{\mathrm{SC}}(N, R, \varepsilon, k)\right)^2. \tag{53}$$

By induction, the corollary below easily follows.

Corollary 4 (DI bound - genie-aided decoding, any k):
Consider the transmission of a polar code in $\mathcal{C}_{\mathrm{pol}}$ over a
BEC($\varepsilon$). Fix $P_e \in (0, 1)$ and consider the recursion (11) with
the initial condition $P_e(0) = P_e$. Pick N big enough and
R such that (52) holds with $P_e(k)$ instead of $P_e$. Then, the
performance of the $(k + 1)$-genie-aided SC decoder is lower
bounded by

$$P_e^{\mathrm{SC}}(N, R, \varepsilon, k + 1) \ge \left(\frac{3}{16}\right)^{2^{k+1} - 1} \cdot \left(P_e^{\mathrm{SC}}(N, R, \varepsilon, k = 0)\right)^{2^{k+1}}. \tag{54}$$



The Intersect step is based on the correlation inequality dis- 
cussed in Section V-A| The Divide step requires the existence 
of flj C SP fe , such that P(U {io ,... , ife}efll F i0t ... >ih ) is slightly 
less than ^P^ c (N,R,e,k). To prove this fact, we first show 
that, choosing a suitably large blocklength, P(Fi) can be made 
as small as required. 



Lemma 14 (No big jumps - Bhattacharyya parameters): 
Let $P_e \in (0, 1)$ and $\varepsilon \in (0, 1)$. Then, for $N$ big enough and 
$R$ ensuring (52), the following bound holds, 

$$P(F_i) < \frac{P_e}{16}, \quad \forall\, i \in \mathcal{F}^c. \tag{55}$$

Proof: Suppose, by contradiction, that 

$$\max_{i \in \mathcal{F}^c} P(F_i) = \max_{i \in \mathcal{F}^c} Z_i = \alpha \ge \frac{P_e}{16}. \tag{56}$$

The number of Bhattacharyya parameters falling in any fixed 
interval $[a, b]$ is $\approx N^{\beta}$ for some $\beta > 0$ [18]. Because of 
(56), all the Bhattacharyya constants falling into the interval 
$[P_e/16, \alpha]$ correspond to unfrozen channels. Hence, choosing 
$N$ big enough, the hypothesis $1 > \sum_{i \in \mathcal{F}^c} Z_i$ is violated and 
the thesis follows. ■ 
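To make the quantities in Lemma 14 concrete, the sketch below computes the Bhattacharyya parameters of the synthetic channels of a BEC($\varepsilon$) and reports the sum and the maximum over the unfrozen set. It is only an illustration under stated assumptions: the unfrozen set is taken to be the $\lfloor NR \rfloor$ channels with the smallest $Z_i$, the BEC recursion $z \mapsto (2z - z^2, z^2)$ is used, and the helper names are hypothetical.

```python
def bec_bhattacharyya(eps, n):
    """Bhattacharyya parameters of the 2**n synthetic channels of a BEC(eps)."""
    z = [eps]
    for _ in range(n):
        # each channel splits into a worse one (2z - z^2) and a better one (z^2)
        z = [w for v in z for w in (2 * v - v * v, v * v)]
    return z

def unfrozen_stats(eps, n, rate):
    """Sum and maximum of the floor(N * rate) smallest Bhattacharyya parameters."""
    z = sorted(bec_bhattacharyya(eps, n))
    num_unfrozen = int(rate * len(z))  # assumed to be at least 1
    chosen = z[:num_unfrozen]
    return sum(chosen), max(chosen)

if __name__ == "__main__":
    for n in (8, 10, 12):
        total, largest = unfrozen_stats(eps=0.5, n=n, rate=0.3)
        print(n, total, largest)
```

Keeping the rate fixed below capacity, as in this usage example, drives both numbers to zero; the regime of interest in the text instead lets $R$ approach $C$ with $N$, so that the sum stays bounded away from zero while the maximum still shrinks.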

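Corollary 5 (Existence of $\mathcal{R}_1$): Let $P_e \in (0,1)$ and $\varepsilon \in 
(0, 1)$. Then, for $N$ big enough and $R$ ensuring (52), there 
exists $\mathcal{R}_1 \subset \mathcal{SP}_k$ such that 

$$P\Big(\bigcup_{\{i_0, \dots, i_k\} \in \mathcal{R}_1} F_{i_0, \dots, i_k}\Big) \ge \frac{3}{8} P_e^{SC}(N, R, \varepsilon, k), \qquad P\Big(\bigcup_{\{i_0, \dots, i_k\} \in \mathcal{R}_1} F_{i_0, \dots, i_k}\Big) \le \frac{1}{2} P_e^{SC}(N, R, \varepsilon, k). \tag{57}$$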



Finally, the proof of Theorem 4 is analogous to that at 
the end of Section III-E. 

C. Scaling Exponent 

Roughly speaking, Theorem 4 implies that the scaling 
exponent cannot change under SC decoding for any fixed 
number of helps from the genie, provided that the sum of the 
Bhattacharyya parameters remains bounded. This statement is 
formalized in the following theorem. 

Theorem 5 (Scaling exponent - genie-aided decoding): 
Consider the set of polar codes $\mathcal{C}_{\rm pol}$ transmitted over a 
BEC($\varepsilon$). Assume that the scaling law of Definition 1 holds 
for the SC decoder with mother curve $f$ and scaling exponent 
$\mu$. If in the regime $N \to \infty : N^{1/\mu}(C - R) = z$ the sum of 
the Bhattacharyya parameters $\sum_{i \in \mathcal{F}^c} Z_i$ is bounded, then for 
any $k \in \mathbb{N}$, 

$$\limsup_{N \to \infty:\ N^{1/\mu}(C-R) = z} P_e^{SC}(N, R, \varepsilon, k) \le f(z), \tag{58}$$

$$\liminf_{N \to \infty:\ N^{1/\mu}(C-R) = z} P_e^{SC}(N, R, \varepsilon, k) \ge \left(\frac{3}{16}\right)^{2^k - 1} \cdot (f(z))^{2^k}. \tag{59}$$

VI. Concluding Remarks 



In this paper, the scaling exponent of list decoders is analyzed 
with an application to polar codes. By means of a Divide 
and Intersect procedure, we lower bound the error probability 
under MAP decoding with list size L for any BMSC and for 
any set of linear codes with unbounded minimum distance, 
and, specifically, for the set of polar codes. As a result, we 
deduce that under MAP decoding the scaling exponent is a 
constant function of the list size. The techniques developed to 
show this main result can also be applied to the analysis of 
genie-aided SC decoding. 
Appendix 

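A. Proof of Lemma 3 

Proof: Consider the Hamming space $\{0,1\}^N$. For $y, z \in 
\{0,1\}^N$ define the following partial order, 

$$y \le z \iff y_i \le z_i, \quad \forall\, i \in \{1, 2, \dots, N\}. \tag{60}$$

Define $y \vee z$ and $y \wedge z$ as 

$$(y \vee z)_i = \begin{cases} 0 & \text{if } y_i = z_i = 0, \\ 1 & \text{else}, \end{cases} \qquad (y \wedge z)_i = \begin{cases} 1 & \text{if } y_i = z_i = 1, \\ 0 & \text{else}. \end{cases} \tag{61}$$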



Just to clarify the ideas, think of $y \in \{0, 1\}^N$ as an erasure 
pattern, as specified at the beginning of Section III-A. Since 
the $N$ copies of the original BEC($\varepsilon$) are independent and 
each of them is erased with probability $\varepsilon$, we consider the 
probability measure defined by 

$$P(y) = \left(\frac{\varepsilon}{1 - \varepsilon}\right)^{w_H(y)} (1 - \varepsilon)^N, \tag{62}$$

where $w_H(\cdot)$ denotes the Hamming weight. 

As $w_H(y \vee z) + w_H(y \wedge z) = w_H(y) + w_H(z)$, we have 

$$P(y) \cdot P(z) = P(y \vee z) \cdot P(y \wedge z). \tag{63}$$



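For any $U_1 \subset \mathbb{F}_2^{NR}$ consider the function $f : \{0, 1\}^N \to 
\{0, 1\}$, defined as 

$$f(y) = 1 - \prod_{u \in U_1} \left(1 - \mathbb{1}_{\{y \in E_u\}}\right),$$

where $E_u$ is defined in (17) and $\mathbb{1}_{\{y \in E_u\}} = 1$ if and only if 
$y \in E_u$. Consequently, if there exists $u \in U_1$ s.t. $u G_y = 0$, 
then $f(y) = 1$; $f(y) = 0$, otherwise. Hence, 

$$\mathbb{E}[f(y)] = 1 \cdot P(f(y) = 1) + 0 \cdot P(f(y) = 0) = P\Big(\bigcup_{u \in U_1} E_u\Big).$$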



If $f(y) \le f(z)$ whenever $y \le z$, then $f(\cdot)$ is said to be 
monotonically increasing. If $y \le z$, then the erasure pattern $z$ 
contains all the erasures of $y$ (and perhaps some more). Thus, 
if $f(y) = 1$, then $f(z) = 1$. Since $f(\cdot)$ can be either 0 or 1, 
this is enough to show that the function is increasing. 

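Analogously, for any $U_2 \subset \mathbb{F}_2^{NR}$, consider the function $g : 
\{0, 1\}^N \to \{0, 1\}$ defined as 

$$g(y) = 1 - \prod_{\tilde{u} \in U_2} \left(1 - \mathbb{1}_{\{y \in E_{\tilde{u}}\}}\right).$$

The function $g(\cdot)$ is increasing and its expected value is given 
by 

$$\mathbb{E}[g(y)] = P\Big(\bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big).$$

In addition, 

$$\mathbb{E}[f(y) g(y)] = P\Big(\bigcup_{u \in U_1} E_u \cap \bigcup_{\tilde{u} \in U_2} E_{\tilde{u}}\Big).$$

The thesis thus follows from the version of the FKG inequality 
presented in Lemma 40 of [19]. ■ 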
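The two ingredients of this proof, the product identity (63) and the resulting positive correlation of increasing functions, can be verified by brute force on a small Hamming space. The sketch below is illustrative only: instead of the events $E_u$, which would require a generator matrix, it uses generic increasing indicator functions (indicators of up-sets generated by a few hypothetical erasure patterns).

```python
from itertools import product

def measure(y, eps):
    """Product measure (62): each coordinate equals 1 (erased) with probability eps."""
    w = sum(y)
    return eps ** w * (1 - eps) ** (len(y) - w)

def join(y, z):
    """Componentwise OR, i.e. y v z."""
    return tuple(a | b for a, b in zip(y, z))

def meet(y, z):
    """Componentwise AND, i.e. y ^ z."""
    return tuple(a & b for a, b in zip(y, z))

def up_set_indicator(generators):
    """Increasing 0/1 function: 1 iff y dominates at least one generator pattern."""
    def f(y):
        return int(any(all(a >= b for a, b in zip(y, g)) for g in generators))
    return f

def correlation_terms(n, eps, gens_f, gens_g):
    """Return E[f], E[g] and E[f g] under the product measure on {0,1}^n."""
    space = list(product((0, 1), repeat=n))
    f, g = up_set_indicator(gens_f), up_set_indicator(gens_g)
    e_f = sum(measure(y, eps) * f(y) for y in space)
    e_g = sum(measure(y, eps) * g(y) for y in space)
    e_fg = sum(measure(y, eps) * f(y) * g(y) for y in space)
    return e_f, e_g, e_fg

if __name__ == "__main__":
    # hypothetical generator patterns, chosen only for illustration
    e_f, e_g, e_fg = correlation_terms(
        4, 0.3, [(1, 1, 0, 0)], [(0, 1, 1, 0), (0, 0, 0, 1)])
    print(e_f, e_g, e_fg, e_fg >= e_f * e_g)
```

For increasing indicators the last printed value is expected to be `True`, which is the positive correlation used in the Intersect step.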

B. Proof of Lemma 5 

Proof: Define 

$$C(P_e, \varepsilon) = \frac{\ln(P_e/8)}{\ln \varepsilon},$$

and let $G = [g_1, g_2, \dots, g_{NR}]^T$ be the generator matrix of the 
polar code of blocklength $N$, rate $R$, and minimum distance 
$d_{\min}$ for the BEC($\varepsilon$). Then, by Lemma 3 of [20], 

$$d_{\min} = \min_{1 \le i \le NR} w_H(g_i).$$

Setting $n = \log_2 N$, we need to show that for $n > \bar{n}(\varepsilon, P_e)$, 

$$w_H(g_i) > C(P_e, \varepsilon), \quad i = 1, 2, \dots, NR. \tag{64}$$



Suppose, by contradiction, that (64) does not hold, i.e., there 
exists a row $g_i$ s.t. for $n > \bar{n}(\varepsilon, P_e)$, 

$$w_H(g_i) \le C(P_e, \varepsilon). \tag{65}$$



Since $G$ is obtained from $G_N$ by eliminating the rows corresponding 
to the frozen indices, $g_i$ is a row of $G_N$, say row of 
index $i'$. Then, by Proposition 17 of [10], 

$$w_H(g_i) = 2^{w_H(b^{(i')})},$$

where $b^{(i')} = (b_1^{(i')}, b_2^{(i')}, \dots, b_n^{(i')})$ is the binary expansion of 
$i' - 1$ over $n$ bits, $b_1^{(i')}$ being the most significant bit and $b_n^{(i')}$ 
the least significant bit. Consequently, (65) implies that 

$$\sum_{j=1}^{n} b_j^{(i')} \le \lceil \log_2 C(P_e, \varepsilon) \rceil = c(P_e, \varepsilon),$$

i.e., the number of 1s in the binary expansion of $i' - 1$ is upper 
bounded by $c(P_e, \varepsilon)$. 
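The relation between row weights and binary expansions can be checked directly for small $n$. The sketch below is a minimal illustration, assuming the standard kernel $F = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ and working with $F^{\otimes n}$ without the bit-reversal permutation; since bit reversal preserves the number of 1s in the index, the set of row weights per popcount is unaffected. Rows are indexed from 0, so index `i` here plays the role of $i' - 1$. The function name is hypothetical.

```python
def kron_power_rows(n):
    """Rows of the n-fold Kronecker power of F = [[1, 0], [1, 1]] as bit lists."""
    rows = [[1]]
    for _ in range(n):
        # F (x) R = [[R, 0], [R, R]] for the current matrix R
        top = [r + [0] * len(r) for r in rows]
        bottom = [r + r for r in rows]
        rows = top + bottom
    return rows

if __name__ == "__main__":
    n = 4
    rows = kron_power_rows(n)
    for i, row in enumerate(rows):
        popcount = bin(i).count("1")
        # Hamming weight of row i equals 2 ** (number of 1s in i)
        assert sum(row) == 2 ** popcount
        print(i, format(i, "0{}b".format(n)), sum(row))
```

A row of weight at most $C$ therefore forces at most $\lceil \log_2 C \rceil$ ones in its index, which is the step used above.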

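The Bhattacharyya parameter $Z_{i'}$ of the $i'$-th synthetic 
channel is given by 

$$Z_{i'} = f_{b_1^{(i')}} \circ f_{b_2^{(i')}} \circ \cdots f_{b_n^{(i')}}(\varepsilon),$$

where $\circ$ denotes function composition and 

$$f_0(x) = 1 - (1 - x)^2, \tag{66}$$
$$f_1(x) = x^2. \tag{67}$$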
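As a numerical companion to the maps (66) and (67), the sketch below evaluates a composition of $f_0$ and $f_1$ on $\varepsilon$ and checks by brute force which arrangement of the maps gives the smallest and the largest value among all bit strings of a given weight. It is an illustration under assumptions: the composition is read with the rightmost map acting first, and the function names are hypothetical.

```python
from itertools import product

def f0(x):
    return 1 - (1 - x) ** 2

def f1(x):
    return x ** 2

def z_of_bits(bits, eps):
    """Evaluate f_{b_1} o f_{b_2} o ... o f_{b_n} at eps (rightmost map acts first)."""
    x = eps
    for b in reversed(bits):
        x = f1(x) if b else f0(x)
    return x

def extremes_for_weight(n, m, eps):
    """Minimum and maximum of z_of_bits over all n-bit strings with m ones."""
    values = [z_of_bits(bits, eps)
              for bits in product((0, 1), repeat=n) if sum(bits) == m]
    return min(values), max(values)

if __name__ == "__main__":
    n, m, eps = 6, 2, 0.4
    lo, hi = extremes_for_weight(n, m, eps)
    # f_1 applied first (m times), then f_0 (n - m times)
    f1_first = z_of_bits((0,) * (n - m) + (1,) * m, eps)
    # f_0 applied first (n - m times), then f_1 (m times)
    f0_first = z_of_bits((1,) * m + (0,) * (n - m), eps)
    print(lo, f1_first)
    print(hi, f0_first)
```

In this small case the smallest value is attained when all the $f_1$ maps act before the $f_0$ maps, and the largest one in the opposite arrangement.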



Notice that $f_0(\cdot)$ and $f_1(\cdot)$ are increasing functions $\forall\, x \in 
[0, 1]$, and that $f_1 \circ f_0(x) \ge f_0 \circ f_1(x)$ $\forall\, x \in [0, 1]$. Consequently, 
if we set $m = w_H(b^{(i')})$, the minimum Bhattacharyya 
parameter $Z_{\min}(m)$ is obtained by applying first the function 
$f_1(x)$ $m$ times and then the function $f_0(x)$ $n - m$ times. The 



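maximum Bhattacharyya parameter $Z_{\max}(m)$ is obtained if 
we apply first the function $f_0(x)$ $n - m$ times and then the 
function $f_1(x)$ $m$ times. Observing also that for all $t \in \mathbb{N}$, 

$$\underbrace{f_0 \circ f_0 \circ \cdots f_0}_{t \text{ times}}(x) = 1 - (1 - x)^{2^t}, \tag{68}$$

$$\underbrace{f_1 \circ f_1 \circ \cdots f_1}_{t \text{ times}}(x) = x^{2^t}, \tag{69}$$

we get 

$$Z_{\min}(m) \le Z_{i'} \le Z_{\max}(m), \tag{70}$$

with 

$$Z_{\min}(m) = 1 - \left(1 - \varepsilon^{2^m}\right)^{2^{n-m}},$$
$$Z_{\max}(m) = \left(1 - (1 - \varepsilon)^{2^{n-m}}\right)^{2^m}.$$

Since $f_1(x) \le f_0(x)$ $\forall\, x \in [0, 1]$ and $m \le c$, we obtain that 

$$Z_{\min}(m) \ge Z_{\min}(c). \tag{71}$$

At this point, we need to show that for $k$ sufficiently large, 

$$Z_{\min}(c) \ge Z_{\max}(c + k). \tag{72}$$

As $1 - (1 - \varepsilon)^{2^{n-c-k}} \le 1$, the condition (72) is satisfied if 

$$1 - \left(1 - \varepsilon^{2^c}\right)^{2^{n-c}} \ge 1 - (1 - \varepsilon)^{2^{n-c-k}},$$

which after some simplifications leads to 

$$k \ge \log_2\left(\frac{\ln(1 - \varepsilon)}{\ln\left(1 - \varepsilon^{2^c}\right)}\right). \tag{73}$$

Notice that the RHS of (73) is an increasing function of $c$. 
As $c < \log_2(C) + 1$, we deduce that the choice 

$$k = \log_2\left(\frac{\ln(1 - \varepsilon)}{\ln\left(1 - \varepsilon^{2C(P_e, \varepsilon)}\right)}\right) \tag{74}$$

also satisfies (72). 

An immediate consequence of inequalities (70), (71), and 
(72) is that $Z_{i'} \ge Z_{\max}(c + k)$. Therefore, we can conclude 
that every channel of index $j$ with $\ge c + k$ ones in the binary 
expansion $b^{(j)}$ of $j - 1$ has Bhattacharyya parameter $Z_j \le Z_{i'}$. 
Consequently, all these channels have not been frozen and, as 
$R < C = 1 - \varepsilon$, 

$$\varepsilon < 1 - R = \frac{\#\,\text{frozen channels}}{\#\,\text{channels}} \le \frac{\sum_{i=0}^{c+k-1} \binom{n}{i}}{2^n} \le \exp\left(-\frac{(n - 2(c + k - 1))^2}{2n}\right),$$

where the last inequality is a consequence of Chernoff bound 
[21]. After some calculations, we conclude that for $n > \bar{n}(\varepsilon, P_e)$, 
where $\bar{n}(\varepsilon, P_e)$ is given by (7), 

$$\exp\left(-\frac{(n - 2(c + k - 1))^2}{2n}\right) < \varepsilon,$$

which is a contradiction. ■ 

C. Proof of Lemma 6 

Proof: As in the proof of Lemma 3 presented in Appendix 
A, consider the Hamming space $\{0, 1\}^N$ with the partial order 
(60). For $y, z \in \{0, 1\}^N$ define $y \vee z$ and $y \wedge z$ as in (61) and 
take the probability measure (62) which satisfies (63). For any 
$\mathcal{P}_1, \mathcal{P}_2 \subset \mathcal{LS}_l$, pick $f : \{0, 1\}^N \to \{0, 1\}$ and $g : \{0, 1\}^N \to 
\{0, 1\}$, defined as 

$$f(y) = 1 - \prod_{\mathrm{span}(u^{(1)}, \dots, u^{(l)}) \in \mathcal{P}_1} \left(1 - \mathbb{1}_{\{y \in E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})}\}}\right),$$

$$g(y) = 1 - \prod_{\mathrm{span}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}) \in \mathcal{P}_2} \left(1 - \mathbb{1}_{\{y \in E_{\mathrm{sp}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)})}\}}\right),$$

where $E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})}$ is given by (29) and 
$\mathbb{1}_{\{y \in E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})}\}} = 1$ if and only if $y \in E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})}$. 
Hence, 

$$\mathbb{E}[f(y)] = P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \dots, u^{(l)}) \in \mathcal{P}_1} E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})}\Big),$$

$$\mathbb{E}[g(y)] = P\Big(\bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}) \in \mathcal{P}_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)})}\Big),$$

$$\mathbb{E}[f(y) g(y)] = P\Big(\bigcup_{\mathrm{span}(u^{(1)}, \dots, u^{(l)}) \in \mathcal{P}_1} E_{\mathrm{sp}(u^{(1)}, \dots, u^{(l)})} \cap \bigcup_{\mathrm{span}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}) \in \mathcal{P}_2} E_{\mathrm{sp}(\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)})}\Big).$$

Since $f(\cdot)$ and $g(\cdot)$ are increasing, the thesis follows by 
Lemma 40 of [19]. ■ 

D. Proof of the Correlation Inequality (37) 

Proof: Consider the binary relation $\le_{\mathcal{A}}$ over the output 
alphabet $\mathcal{A}$ of the channel, defined for all $y_i, z_i \in \mathcal{A}$ as 

$$y_i \le_{\mathcal{A}} z_i \iff \frac{p(y_i \mid 1)}{p(y_i \mid 0)} \le \frac{p(z_i \mid 1)}{p(z_i \mid 0)}. \tag{75}$$

The relation $\le_{\mathcal{A}}$ is transitive and total. As concerns the 
antisymmetry, $\le_{\mathcal{A}}$ satisfies the property if the following implication 
holds for all $y_i, z_i \in \mathcal{A}$, 

$$\frac{p(y_i \mid 1)}{p(y_i \mid 0)} = \frac{p(z_i \mid 1)}{p(z_i \mid 0)} \implies y_i = z_i. \tag{76}$$

Note that, without loss of generality, we can assume that the 
channel output identifies with the log-likelihood ratio, see [1]. 
With this assumption of using the canonical representation of 
the channel, (76) is also fulfilled. Hence, $\le_{\mathcal{A}}$ is a total ordering 
over $\mathcal{A}$. 

Set $\mathcal{L} = \mathcal{A}^N$ and for any $y = (y_1, \dots, y_N)$ and $z = 
(z_1, \dots, z_N)$ in $\mathcal{L}$ define the binary relation $\le_{\mathcal{L}}$ as 

$$y \le_{\mathcal{L}} z \iff y_i \le_{\mathcal{A}} z_i, \quad \forall\, i \in \{1, \dots, N\}. \tag{77}$$

It is easy to check that $\le_{\mathcal{L}}$ is a partial order over the $N$-fold 
Cartesian product $\mathcal{A}^N$. 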



For any $y, z \in \mathcal{L}$, denote by $y \vee z$ their unique minimal 
upper bound and by $y \wedge z$ their unique maximal lower bound, 
defined as 

$$(y \vee z)_i = \max_{\le_{\mathcal{A}}}(y_i, z_i), \quad \forall\, i \in \{1, \dots, N\},$$

$$(y \wedge z)_i = \min_{\le_{\mathcal{A}}}(y_i, z_i), \quad \forall\, i \in \{1, \dots, N\}.$$

Since the alphabet $\mathcal{A}$ is finite and the distributive law holds, 
i.e., 

$$y \wedge (z \vee w) = (y \wedge z) \vee (y \wedge w), \quad \forall\, y, z, w \in \mathcal{L},$$

the set $\mathcal{L}$ with the partial ordering $\le_{\mathcal{L}}$ is a finite distributive 
lattice. Observe that in the proof of Appendix A the finite 
distributive lattice $\mathcal{L}$ is replaced by the Hamming space 
$\{0,1\}^N$. 
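The lattice structure induced by the likelihood-ratio ordering can be illustrated on a toy channel. The sketch below is not taken from the paper: it assumes a hypothetical binary-input channel with three outputs (`"0"`, `"?"`, `"1"`) and made-up transition probabilities with distinct likelihood ratios, builds the componentwise join and meet, and checks the distributive law exhaustively for $N = 2$.

```python
from itertools import product

# Hypothetical symmetric channel: transition probabilities are assumptions.
P0 = {"0": 0.7, "?": 0.2, "1": 0.1}  # p(y | 0)
P1 = {"0": 0.1, "?": 0.2, "1": 0.7}  # p(y | 1)

def lr(a):
    """Likelihood ratio p(a | 1) / p(a | 0) used to order the output alphabet."""
    return P1[a] / P0[a]

def join(y, z):
    """Componentwise maximum with respect to the likelihood-ratio order."""
    return tuple(max(a, b, key=lr) for a, b in zip(y, z))

def meet(y, z):
    """Componentwise minimum with respect to the likelihood-ratio order."""
    return tuple(min(a, b, key=lr) for a, b in zip(y, z))

def leq(y, z):
    """Partial order on N-tuples: every coordinate of y is below that of z."""
    return all(lr(a) <= lr(b) for a, b in zip(y, z))

def is_distributive(n):
    """Check y ^ (z v w) == (y ^ z) v (y ^ w) for all triples of n-tuples."""
    space = list(product(P0, repeat=n))
    return all(meet(y, join(z, w)) == join(meet(y, z), meet(y, w))
               for y in space for z in space for w in space)

if __name__ == "__main__":
    y, z = ("0", "1"), ("?", "0")
    print(join(y, z), meet(y, z), leq(meet(y, z), join(y, z)))
    print(is_distributive(2))
```

Because the order on each coordinate is total, join and meet simply select one of the two symbols per coordinate, which is why the distributive law holds.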

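Let $\mu : \mathcal{L} \to \mathbb{R}^+$ be defined as 

$$\mu(y) = p\left(y \mid 0^{NR}\right). \tag{78}$$

In words, $\mu(\cdot)$ represents the probability of receiving the $N$-tuple 
$y$ from the channel, given that the all-zero information 
vector $0^{NR}$ was sent. We say that such a function is log-supermodular 
if, for all $y, z \in \mathcal{L}$, 

$$\mu(y) \cdot \mu(z) \le \mu(y \vee z) \cdot \mu(y \wedge z). \tag{79}$$

An easy check shows that (79) is satisfied with equality with 
the choice (78). Notice that in the proof of Appendix A the 
log-supermodular function $\mu(\cdot)$ is replaced by the probability 
measure (62). 

For any $U_1' \subset \mathbb{F}_2^{NR}$, consider the function $f : \mathcal{L} \to \{0, 1\}$, 
defined as 

$$f(y) = 1 - \prod_{u \in U_1'} \left(1 - \mathbb{1}_{\{y \in E_u'\}}\right),$$

where $E_u'$ is given by (30) and $\mathbb{1}_{\{y \in E_u'\}} = 1$ if and only if 
$y \in E_u'$. If $f(y) \le f(z)$ whenever $y \le_{\mathcal{L}} z$, then $f(\cdot)$ is said to 
be monotonically increasing. Since $f(\cdot)$ can be either 0 or 1, 
we only need to prove the implication $f(y) = 1 \Rightarrow f(z) = 1$, 
whenever $y \le_{\mathcal{L}} z$. If $f(y) = 1$, there exists $u^* \in U_1'$ such that 

$$0 \le \sum_{i \in I_{u^*}} \ln \frac{p(y_i \mid 1)}{p(y_i \mid 0)}.$$

As $y_i \le_{\mathcal{A}} z_i$ for all $i \in \{1, \dots, N\}$, by definition (75) we 
obtain 

$$\sum_{i \in I_{u^*}} \ln \frac{p(y_i \mid 1)}{p(y_i \mid 0)} \le \sum_{i \in I_{u^*}} \ln \frac{p(z_i \mid 1)}{p(z_i \mid 0)},$$

which implies that $f(z) = 1$. As a result, $f(\cdot)$ is increasing. 

Analogously, for any $U_2' \subset \mathbb{F}_2^{NR}$, consider the function $g : 
\mathcal{L} \to \{0, 1\}$ defined as 

$$g(y) = 1 - \prod_{\tilde{u} \in U_2'} \left(1 - \mathbb{1}_{\{y \in E_{\tilde{u}}'\}}\right).$$

Using the same argument seen for the function $f(\cdot)$, one 
realizes that $g(\cdot)$ is an increasing function. 
By the FKG inequality [22], 

$$\sum_{y \in \mathcal{L}} \mu(y) f(y) \cdot \sum_{y \in \mathcal{L}} \mu(y) g(y) \le \sum_{y \in \mathcal{L}} \mu(y) f(y) g(y) \cdot \sum_{y \in \mathcal{L}} \mu(y).$$

Observing that 

$$\sum_{y \in \mathcal{L}} \mu(y) = 1,$$

$$\sum_{y \in \mathcal{L}} \mu(y) f(y) = P\Big(\bigcup_{u \in U_1'} E_u'\Big),$$

$$\sum_{y \in \mathcal{L}} \mu(y) g(y) = P\Big(\bigcup_{\tilde{u} \in U_2'} E_{\tilde{u}}'\Big),$$

$$\sum_{y \in \mathcal{L}} \mu(y) f(y) g(y) = P\Big(\bigcup_{u \in U_1'} E_u' \cap \bigcup_{\tilde{u} \in U_2'} E_{\tilde{u}}'\Big),$$

we obtain the thesis (37). ■ 

E. Proof of Lemma 10 

Proof: Following the approach of Appendix B, suppose, 
by contradiction, that there is an unfrozen index $i'$ of $G_N$, 
such that the number of 1s in the binary expansion of $i' - 1$ 
is upper bounded by $c(P_e, Z)$, defined as 

$$c(P_e, Z) = \left\lceil \log_2 \frac{\ln(P_e/8)}{\ln Z} \right\rceil.$$

The Bhattacharyya parameter $Z_{i'}$ has the following expression 

$$Z_{i'} = f_{b_1^{(i')}} \circ f_{b_2^{(i')}} \circ \cdots f_{b_n^{(i')}}(Z),$$

where $f_1(x)$ is given by (67), and $f_0(x)$ can be bounded as 

$$\sqrt{1 - (1 - x^2)^2} = f_0^{(l)}(x) \le f_0(x) \le f_0^{(u)}(x) = 1 - (1 - x)^2.$$

Since $f_1(x)$ and $f_0^{(l)}(x)$ are increasing and $f_0^{(l)}(x) \le f_0(x)$, 
we have 

$$Z_{i'} \ge Z_{i'}^{(l)} = f^{(l)}_{b_1^{(i')}} \circ f^{(l)}_{b_2^{(i')}} \circ \cdots f^{(l)}_{b_n^{(i')}}(Z),$$

where, for the sake of simplicity, we have defined $f_1^{(l)}(x) = 
f_1(x)$. Setting $m = w_H(b^{(i')})$ and remarking that $f_1^{(l)} \circ 
f_0^{(l)}(x) \ge f_0^{(l)} \circ f_1^{(l)}(x)$, a lower bound on $Z_{i'}^{(l)}$ is obtained 
applying first the function $f_1^{(l)}(x)$ $m$ times and then the 
function $f_0^{(l)}(x)$ $n - m$ times. Using (69) and observing that 
for all $t \in \mathbb{N}$, 

$$\underbrace{f_0^{(l)} \circ f_0^{(l)} \circ \cdots f_0^{(l)}}_{t \text{ times}}(x) = \sqrt{1 - (1 - x^2)^{2^t}},$$

we get 

$$Z_{i'}^{(l)} \ge Z_{\min}^{(l)}(m) = \sqrt{1 - \left(1 - Z^{2^{m+1}}\right)^{2^{n-m}}}.$$

Since $f_1^{(l)}(x) \le f_0^{(l)}(x)$ and $m \le c$, we obtain that 

$$Z_{\min}^{(l)}(m) \ge Z_{\min}^{(l)}(c).$$

On the other hand, let $Z_j$ be the Bhattacharyya parameter 
of the synthetic channel of index $j$ with $\ge c + k$ ones in the 
binary expansion $b^{(j)}$ of $j - 1$. Since $f_1(x)$ and $f_0^{(u)}(x)$ are 
increasing and $f_0(x) \le f_0^{(u)}(x)$, we have 

$$Z_j \le Z_j^{(u)} = f^{(u)}_{b_1^{(j)}} \circ f^{(u)}_{b_2^{(j)}} \circ \cdots f^{(u)}_{b_n^{(j)}}(Z),$$

where we have defined for the sake of simplicity $f_1^{(u)}(x) = 
f_1(x)$. Setting $m' = w_H(b^{(j)})$ and remarking that $f_1^{(u)} \circ 
f_0^{(u)}(x) \ge f_0^{(u)} \circ f_1^{(u)}(x)$, an upper bound on $Z_j^{(u)}$ is obtained 
applying first the function $f_0^{(u)}(x)$ $n - m'$ times and then the 
function $f_1^{(u)}(x)$ $m'$ times. Using (68) and (69), we get 

$$Z_j^{(u)} \le Z_{\max}^{(u)}(m') = \left(1 - (1 - Z)^{2^{n-m'}}\right)^{2^{m'}}.$$

Since $f_1^{(u)}(x) \le f_0^{(u)}(x)$ and $m' \ge c + k$, we have that 

$$Z_{\max}^{(u)}(m') \le Z_{\max}^{(u)}(c + k).$$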

At this point, we need to pick $k$ such that the following 
inequality holds, 

$$Z_{\min}^{(l)}(c) \ge Z_{\max}^{(u)}(c + k).$$
After some calculations, one obtains that 



( ln(l - Z) 



fulfills the requirement. 

As a result, every channel of index $j$ with $\ge c + k$ ones in the 
binary expansion $b^{(j)}$ of $j - 1$ cannot be frozen. By Chernoff 
bound [21], we get a contradiction for $n > \bar{n}(Z, C, P_e)$, where 
$\bar{n}(Z, C, P_e)$ is given by (8). ■ 
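The counting step behind the Chernoff bound can be checked numerically. The sketch below compares the exact fraction of $n$-bit indices with at most $a$ ones against the bound $\exp\left(-(n - 2a)^2 / (2n)\right)$, the form used in Appendix B with $a = c + k - 1$. The function names are hypothetical, and the comparison is restricted to $a \le n/2$, where the bound is informative.

```python
from math import comb, exp

def fraction_low_weight(n, a):
    """Fraction of n-bit indices whose binary expansion has at most a ones."""
    return sum(comb(n, i) for i in range(a + 1)) / 2 ** n

def chernoff_bound(n, a):
    """Tail bound exp(-(n - 2a)^2 / (2n)) for a fair binomial, with a <= n / 2."""
    return exp(-(n - 2 * a) ** 2 / (2 * n))

if __name__ == "__main__":
    for n in (10, 20, 40):
        for a in (0, n // 4, n // 2):
            exact = fraction_low_weight(n, a)
            bound = chernoff_bound(n, a)
            print(n, a, exact, bound, exact <= bound)
```

For fixed $a$ the bound decays like $e^{-n/2}$, so it eventually falls below any fixed positive constant, which is what produces the contradiction for $n$ large enough.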



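F. Proof of Lemma 11 

Proof: Consider the finite distributive lattice $\mathcal{L} = \mathcal{A}^N$ 
with the partial ordering $\le_{\mathcal{L}}$ defined in (77). Let $\mu : \mathcal{L} \to \mathbb{R}^+$ 
be the log-supermodular function (78). 

For any $\mathcal{P}_1', \mathcal{P}_2' \subset \mathcal{SS}_l$, consider the functions $f : \mathcal{L} \to 
\{0, 1\}$ and $g : \mathcal{L} \to \{0, 1\}$, given by 

$$f(y) = 1 - \prod_{\{u^{(1)}, \dots, u^{(l)}\} \in \mathcal{P}_1'} \left(1 - \mathbb{1}_{\{y \in E'_{u^{(1)}, \dots, u^{(l)}}\}}\right),$$

$$g(y) = 1 - \prod_{\{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}\} \in \mathcal{P}_2'} \left(1 - \mathbb{1}_{\{y \in E'_{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}}\}}\right),$$

where $E'_{u^{(1)}, \dots, u^{(l)}}$ is defined in (32) and $\mathbb{1}_{\{y \in E'_{u^{(1)}, \dots, u^{(l)}}\}} = 
1$ if and only if $y \in E'_{u^{(1)}, \dots, u^{(l)}}$. For analogous reasons to 
those pointed out in Appendix D, $f(\cdot)$ and $g(\cdot)$ are monotonically 
increasing. 

Noticing that 

$$\sum_{y \in \mathcal{L}} \mu(y) f(y) = P\Big(\bigcup_{\{u^{(1)}, \dots, u^{(l)}\} \in \mathcal{P}_1'} E'_{u^{(1)}, \dots, u^{(l)}}\Big),$$

$$\sum_{y \in \mathcal{L}} \mu(y) g(y) = P\Big(\bigcup_{\{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}\} \in \mathcal{P}_2'} E'_{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}}\Big),$$

$$\sum_{y \in \mathcal{L}} \mu(y) f(y) g(y) = P\Big(\bigcup_{\{u^{(1)}, \dots, u^{(l)}\} \in \mathcal{P}_1'} E'_{u^{(1)}, \dots, u^{(l)}} \cap \bigcup_{\{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}\} \in \mathcal{P}_2'} E'_{\tilde{u}^{(1)}, \dots, \tilde{u}^{(l)}}\Big),$$

the thesis follows from the FKG inequality [22]. ■ 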



References 

[1] T. Richardson and R. Urbanke, Modern Coding Theory. Cambridge 
University Press, 2008. 

[2] A. Barg and G. Forney, "Random codes: minimum distances and error 
exponents," IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2568-2573, 
2002. 

[3] A. Amraoui, A. Montanari, T. Richardson, and R. Urbanke, "Finite-length 
scaling for iteratively decoded LDPC ensembles," IEEE Trans. 
Inf. Theory, vol. 55, no. 2, pp. 473-498, Feb. 2009. 

[4] P. Elias, "List decoding for noisy channels," Institute of Radio Engineers 
(now IEEE), Tech. Rep., 1957. 

[5] J. Wozencraft, "List decoding," Research Laboratory of Electronics, 
Massachusetts Institute of Technology, Tech. Rep., 1958. 

[6] C. Shannon, R. Gallager, and E. Berlekamp, "Lower bounds to error 
probability for coding on discrete memoryless channels. I," Inform. 
Contr., vol. 10, no. 1, pp. 65-103, 1967. 

[7] G. J. Forney, "Exponential error bounds for erasure, list, and decision 
feedback schemes," IEEE Trans. Inf. Theory, vol. 14, no. 2, pp. 206-220, 
1968. 

[8] R. Gallager, Information Theory and Reliable Communication. New 
York, NY, USA: John Wiley & Sons, Inc., 1968. 

[9] E. Hof, I. Sason, and S. Shamai, "Performance bounds for erasure, list, 
and decision feedback schemes with linear block codes," IEEE Trans. 
Inf. Theory, vol. 56, no. 8, pp. 3754-3778, 2010. 

[10] E. Arikan, "Channel polarization: a method for constructing capacity-achieving 
codes for symmetric binary-input memoryless channels," IEEE 
Trans. Inf. Theory, vol. 55, no. 7, pp. 3051-3073, July 2009. 

[11] E. Arikan and E. Telatar, "On the rate of channel polarization," in Proc. 
IEEE Int. Symp. on Inf. Theory (ISIT), July 2009, pp. 1493-1495. 

[12] S. Hassani and R. Urbanke, "On the scaling of polar codes: I. The 
behavior of polarized channels," in Proc. IEEE Int. Symp. on Inf. Theory 
(ISIT), 2010, pp. 874-878. 

[13] S. Korada, A. Montanari, E. Telatar, and R. Urbanke, "An empirical 
scaling law for polar codes," in Proc. IEEE Int. Symp. on Inf. Theory 
(ISIT), June 2010, pp. 884-888. 

[14] A. Goli, S. Hassani, and R. Urbanke, "Universal bounds on the scaling 
behavior of polar codes," in Proc. IEEE Int. Symp. on Inf. Theory (ISIT), 
July 2012, pp. 1957-1961. 

[15] I. Tal and A. Vardy, "List decoding of polar codes," in Proc. IEEE Int. 
Symp. on Inf. Theory (ISIT), Aug. 2011, pp. 1-5. 

[16] C. M. Fortuin, P. W. Kasteleyn, and J. Ginibre, "Correlation inequalities 
on some partially ordered sets," Commun. Math. Phys., vol. 22, pp. 89-103, 
1971. 

[17] M. Parizi and E. Telatar, "On correlation between polarized BECs," 
submitted to IEEE Int. Symp. on Inf. Theory (ISIT) 2013, available: 
http://arxiv.org/pdf/1301.5536.pdf 

[18] S. Hassani, K. Alishahi, and R. Urbanke, "On the scaling of polar codes: 
II. The behavior of un-polarized channels," in Proc. IEEE Int. Symp. on 
Inf. Theory (ISIT), 2010, pp. 879-883. 

[19] S. Korada and R. Urbanke, "Exchange of limits: why iterative decoding 
works," IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 2169-2187, Apr. 
2011. 

[20] N. Hussami, R. Urbanke, and S. Korada, "Performance of polar codes 
for channel and source coding," in Proc. IEEE Int. Symp. on Inf. Theory 
(ISIT), July 2009, pp. 1488-1492. 

[21] [Online]. Available: http://en.wikipedia.org/wiki/Chernoff_bound 

[22] N. Alon, J. Spencer, and P. Erdős, The Probabilistic Method. Wiley, 
1992. 



